-
Proceedings of the National Academy of... Jan 2017
The spatial distribution of individuals of any species is a basic concern of ecology. The spatial distribution of parasites matters to control and conservation of parasites that affect human and nonhuman populations. This paper develops a quantitative theory to predict the spatial distribution of parasites based on the distribution of parasites in hosts and the spatial distribution of hosts. Four models are tested against observations of metazoan hosts and their parasites in littoral zones of four lakes in Otago, New Zealand. These models differ in two dichotomous assumptions, constituting a 2 × 2 theoretical design. One assumption specifies whether the variance function of the number of parasites per host individual is described by Taylor's law (TL) or the negative binomial distribution (NBD). The other assumption specifies whether the numbers of parasite individuals within each host in a square meter of habitat are independent or perfectly correlated among host individuals. We find empirically that the variance-mean relationship of the numbers of parasites per square meter is very well described by TL but is not well described by NBD. Two models that posit perfect correlation of the parasite loads of hosts in a square meter of habitat approximate observations much better than two models that posit independence of parasite loads of hosts in a square meter, regardless of whether the variance-mean relationship of parasites per host individual obeys TL or NBD. We infer that high local interhost correlations in parasite load strongly influence the spatial distribution of parasites. Local hotspots could influence control and conservation of parasites.
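The two dichotomous assumptions in this abstract can be illustrated with a small sketch. It uses the standard forms of the two variance functions (Taylor's law, variance = a · mean^b; negative binomial, variance = mean + mean²/k) and the elementary result that the variance of a sum of n identically distributed loads is n·v when they are independent and n²·v when they are perfectly correlated. The function names, the parameter values, and the simplification to a fixed number of hosts per square meter are assumptions made for illustration, not the paper's full models.

```python
def taylor_variance(mean, a, b):
    """Taylor's law: variance = a * mean**b."""
    return a * mean ** b


def nbd_variance(mean, k):
    """Negative binomial with aggregation parameter k: variance = mean + mean**2 / k."""
    return mean + mean ** 2 / k


def total_variance(n_hosts, per_host_variance, correlated):
    """Variance of the summed parasite load of n_hosts hosts in one square meter.

    Independent loads add variances (n * v); perfectly correlated loads
    behave like n copies of one random variable (n**2 * v).
    Assumes a fixed host count, which is a simplification.
    """
    factor = n_hosts ** 2 if correlated else n_hosts
    return factor * per_host_variance


v = nbd_variance(mean=4.0, k=2.0)               # 4 + 16 / 2 = 12.0
print(total_variance(5, v, correlated=False))   # 60.0
print(total_variance(5, v, correlated=True))    # 300.0
```

With the same per-host variance, perfect correlation inflates the per-square-meter variance by a factor equal to the number of hosts, which is one way to see why local interhost correlation can matter so much for the spatial pattern.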
Topics: Animals; Binomial Distribution; Demography; Ecology; Host-Parasite Interactions; Humans; Models, Biological; New Zealand; Parasite Load; Parasites; Population Dynamics
PubMed: 27994156
DOI: 10.1073/pnas.1618803114 -
BMC Bioinformatics May 2017
BACKGROUND
Sample size calculation and power estimation are essential components of experimental designs in biomedical research. It is very challenging to estimate power for RNA-Seq differential expression under complex experimental designs. Moreover, the dependency among genes should be taken into account in order to obtain accurate results.
RESULTS
In this paper, we propose a simulation based procedure for power estimation using the negative binomial distribution and assuming a generalized linear model (at the gene level) that considers the dependence between gene expression level and its variance (dispersion) and also allows equal or unequal dispersion across conditions. We compared the performance of both Wald test and likelihood ratio test under different scenarios. The null distribution of the test statistics was simulated for the desired false positive control to avoid excess false positives with the usage of an asymptotic chi-square distribution. We applied this method to the TCGA breast cancer data set.
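The general idea of simulation-based power estimation with a simulated null distribution can be sketched as follows. This is a minimal single-gene illustration, not the authors' procedure: it replaces the GLM-based Wald and likelihood ratio tests with a simple absolute log ratio of group means, and every name and parameter value (`estimate_power`, `mu`, `fold_change`, `dispersion`) is hypothetical. Counts are drawn from a negative binomial via a gamma-Poisson mixture with variance mu + dispersion · mu².

```python
import math
import random


def poisson(lam, rng):
    """Knuth's multiplication method; adequate for the moderate rates used here."""
    limit, k, p = math.exp(-lam), 0, rng.random()
    while p > limit:
        k += 1
        p *= rng.random()
    return k


def nb_sample(mu, dispersion, rng):
    """Negative binomial draw as a gamma-Poisson mixture (mean mu)."""
    lam = rng.gammavariate(1.0 / dispersion, mu * dispersion)
    return poisson(lam, rng)


def log_ratio(mu1, mu2, n, dispersion, rng):
    """Absolute log ratio of the two group means (0.5 added to avoid log 0)."""
    m1 = sum(nb_sample(mu1, dispersion, rng) for _ in range(n)) / n
    m2 = sum(nb_sample(mu2, dispersion, rng) for _ in range(n)) / n
    return abs(math.log(m2 + 0.5) - math.log(m1 + 0.5))


def estimate_power(mu, fold_change, n, dispersion, alpha=0.05, n_sim=500, seed=1):
    rng = random.Random(seed)
    # The critical value comes from a simulated null distribution rather than
    # from an asymptotic chi-square approximation.
    null = sorted(log_ratio(mu, mu, n, dispersion, rng) for _ in range(n_sim))
    critical = null[math.ceil((1 - alpha) * n_sim) - 1]
    hits = sum(log_ratio(mu, mu * fold_change, n, dispersion, rng) > critical
               for _ in range(n_sim))
    return hits / n_sim


print(estimate_power(mu=20, fold_change=2.0, n=10, dispersion=0.2))
```

Calling `estimate_power` with `fold_change=1.0` gives a rough check that the rejection rate under the null stays near the nominal level.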
CONCLUSIONS
We provide a framework for power estimation of RNA-Seq data. The proposed procedure is able to properly control the false positive error rate at the nominal level.
Topics: Binomial Distribution; Breast Neoplasms; False Positive Reactions; Gene Expression Profiling; Humans; Linear Models; Sequence Analysis, RNA; Statistics as Topic
PubMed: 28468606
DOI: 10.1186/s12859-017-1648-2 -
BMC Bioinformatics Sep 2016
BACKGROUND
RNA-sequencing (RNA-Seq) has become a powerful technology to characterize gene expression profiles because it is more accurate and comprehensive than microarrays. Although statistical methods that have been developed for microarray data can be applied to RNA-Seq data, they are not ideal due to the discrete nature of RNA-Seq data. The Poisson distribution and negative binomial distribution are commonly used to model count data. Recently, Witten (Annals Appl Stat 5:2493-2518, 2011) proposed a Poisson linear discriminant analysis for RNA-Seq data. The Poisson assumption may not be as appropriate as the negative binomial distribution when biological replicates are available and in the presence of overdispersion (i.e., when the variance is larger than the mean). However, it is more complicated to model negative binomial variables because they involve a dispersion parameter that needs to be estimated.
RESULTS
In this paper, we propose a negative binomial linear discriminant analysis for RNA-Seq data. By Bayes' rule, we construct the classifier by fitting a negative binomial model, and propose some plug-in rules to estimate the unknown parameters in the classifier. The relationship between the negative binomial classifier and the Poisson classifier is explored, with a numerical investigation of the impact of dispersion on the discriminant score. Simulation results show the superiority of our proposed method. We also analyze two real RNA-Seq data sets to demonstrate the advantages of our method in real-world applications.
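A Bayes-rule classifier built on a negative binomial model can be sketched in a few lines. This is a simplified illustration rather than the published NBLDA code: it assumes a single shared dispersion `phi` (variance = mu + phi · mu²), takes the per-class gene means as given plug-in values, and omits the size-factor normalization a real RNA-Seq analysis would need. All names and numbers are hypothetical.

```python
import math


def nb_logpmf(x, mu, phi):
    """Log pmf of a negative binomial with mean mu and dispersion phi."""
    r = 1.0 / phi
    return (math.lgamma(x + r) - math.lgamma(r) - math.lgamma(x + 1)
            + r * math.log(r / (r + mu)) + x * math.log(mu / (r + mu)))


def classify(counts, class_means, phi, priors=None):
    """Assign the class with the largest log prior plus summed gene-wise log likelihood."""
    scores = {}
    for label, means in class_means.items():
        prior = priors[label] if priors else 1.0 / len(class_means)
        scores[label] = math.log(prior) + sum(
            nb_logpmf(x, mu, phi) for x, mu in zip(counts, means))
    return max(scores, key=scores.get)


# Two genes, two classes with opposite expression patterns (made-up values).
class_means = {"A": [10.0, 100.0], "B": [100.0, 10.0]}
print(classify([12, 90], class_means, phi=0.1))  # A
```

As `phi` approaches zero the negative binomial approaches the Poisson, which is the sense in which the two classifiers are related.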
CONCLUSIONS
We have developed a new classifier using the negative binomial model for RNA-seq data classification. Our simulation results show that our proposed classifier has a better performance than existing works. The proposed classifier can serve as an effective tool for classifying RNA-seq data. Based on the comparison results, we have provided some guidelines for scientists to decide which method should be used in the discriminant analysis of RNA-Seq data. R code is available at http://www.comp.hkbu.edu.hk/~xwan/NBLDA.R or https://github.com/yangchadam/NBLDA.
Topics: Bayes Theorem; Binomial Distribution; Discriminant Analysis; Humans; RNA; Sequence Analysis, RNA; Transcriptome
PubMed: 27623864
DOI: 10.1186/s12859-016-1208-1 -
Frontiers in Cardiovascular Medicine 2021
Although β-blockers impressively reduce mortality in chronic heart failure (CHF), there are concerns about negative inotropic effects and worsening of hemodynamics in acute decompensated heart failure. Can receptor theory dispel these concerns and support the clinical practice of using β-blockers? In CHF, concentrations of catecholamines at the β-adrenoceptors usually exceed their dissociation constants (K_Ds). The homodimeric β-adrenoceptors have a receptor reserve and display negative cooperativity. We considered the binomial distribution of occupied receptor dimers with respect to the interaction of an exogenous β-blocker and elevated endogenous agonist concentrations > [K_Ds], corresponding to an elevated sympathetic tone. Modeling based on binomial distribution suggests that despite the presence of a low concentration of the antagonist, the activation of the dimer receptors is higher than that in its absence. Obviously, the antagonist improves the ratio of the dimer receptors with only single agonist activation compared with the dimer receptors with double activation. This leads to increased positive inotropic effects of endogenous catecholamines due to a β-blocker. Understanding the positive inotropic consequences of β-blockers in CHF is clinically relevant. This article may help to eliminate the skepticism of clinicians about the use of β-blockers because of their supposed negative inotropic effect, since, on the contrary, a positive inotropic effect can be expected for receptor-theoretical reasons.
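The binomial bookkeeping for receptor dimers can be sketched as below. This is a deliberately reduced illustration, not the article's model: it assumes simple competitive binding (the Gaddum equation) and treats the two protomers of a dimer as independent, so it ignores the negative cooperativity and receptor reserve the article describes. It only shows how a competitive antagonist shifts the ratio of singly to doubly agonist-occupied dimers; the concentration ratios are made-up values.

```python
def agonist_occupancy(agonist_ratio, antagonist_ratio=0.0):
    """Fraction of sites holding agonist; each ratio is concentration / dissociation constant."""
    return agonist_ratio / (1.0 + agonist_ratio + antagonist_ratio)


def dimer_states(p):
    """Binomial(2, p) probabilities of 0, 1, or 2 agonist-occupied protomers."""
    return {0: (1 - p) ** 2, 1: 2 * p * (1 - p), 2: p ** 2}


def single_to_double_ratio(p):
    states = dimer_states(p)
    return states[1] / states[2]


without_blocker = agonist_occupancy(agonist_ratio=10.0)
with_blocker = agonist_occupancy(agonist_ratio=10.0, antagonist_ratio=2.0)
print(single_to_double_ratio(without_blocker))
print(single_to_double_ratio(with_blocker))
```

Under these assumptions the ratio equals 2(1 − p)/p, so any reduction in agonist occupancy p raises the share of singly occupied dimers relative to doubly occupied ones.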
PubMed: 34179127
DOI: 10.3389/fcvm.2021.639562 -
Frontiers in Microbiology 2021
Conventional regression analysis using the least-squares method has been applied to describe bacterial behavior logarithmically. However, only the normal distribution is used as the error distribution in the least-squares method, and the variability and uncertainty related to bacterial behavior are not considered. In this paper, we propose Bayesian statistical modeling based on a generalized linear model (GLM) that considers variability and uncertainty while fitting the model to colony count data. We investigated the inactivation kinetic data of with an initial cell count of 10 and the growth kinetic data of with an initial cell count of 10. The residual of the GLM was described using a Poisson distribution for the initial cell number and inactivation process and using a negative binomial distribution for the cell number variation during growth. The model parameters could be obtained considering the uncertainty by Bayesian inference. The Bayesian GLM successfully described the results of over 50 replications of bacterial inactivation with average initial cell numbers of 10, 10, and 10 and growth with average initial cell numbers of 10, 10, and 10. The developed model was accurate in that more than 90% of the observed cell numbers, except for growth with initial cell numbers of 10, were within the 95% prediction interval. In addition, parameter uncertainty could be expressed as an arbitrary probability distribution. The analysis procedures can be consistently applied to the simulation process through fitting. The Bayesian inference method based on the GLM clearly explains the variability and uncertainty in bacterial population behavior, which can serve as useful information for risk assessment related to foodborne pathogens.
PubMed: 34248886
DOI: 10.3389/fmicb.2021.674364 -
Statistics in Medicine May 2017
In meta-analysis of odds ratios (ORs), heterogeneity between the studies is usually modelled via the additive random effects model (REM). An alternative, multiplicative REM for ORs uses overdispersion. The multiplicative factor in this overdispersion model (ODM) can be interpreted as an intra-class correlation (ICC) parameter. This model naturally arises when the probabilities of an event in one or both arms of a comparative study are themselves beta-distributed, resulting in beta-binomial distributions. We propose two new estimators of the ICC for meta-analysis in this setting. One is based on the inverted Breslow-Day test, and the other on the improved gamma approximation by Kulinskaya and Dollinger (2015, p. 26) to the distribution of Cochran's Q. The performance of these and several other estimators of ICC on bias and coverage is studied by simulation. Additionally, the Mantel-Haenszel approach to estimation of ORs is extended to the beta-binomial model, and we study performance of various ICC estimators when used in the Mantel-Haenszel or the inverse-variance method to combine ORs in meta-analysis. The results of the simulations show that the improved gamma-based estimator of ICC is superior for small sample sizes, and the Breslow-Day-based estimator is the best for n⩾100. The Mantel-Haenszel-based estimator of OR is very biased and is not recommended. The inverse-variance approach is also somewhat biased for ORs≠1, but this bias is not very large in practical settings. Developed methods and R programs, provided in the Web Appendix, make the beta-binomial model a feasible alternative to the standard REM for meta-analysis of ORs. © 2017 The Authors. Statistics in Medicine Published by John Wiley & Sons Ltd.
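The link between beta-distributed event probabilities, overdispersion, and the ICC can be checked numerically. The sketch below uses the standard beta-binomial result that, for shape parameters a and b, the intra-class correlation is ρ = 1/(a + b + 1) and the variance is n·p·(1 − p)·[1 + (n − 1)·ρ], where p = a/(a + b). It does not implement the paper's estimators; the function names and the example parameters are illustrative only.

```python
import math


def log_beta(a, b):
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def beta_binomial_pmf(k, n, a, b):
    """P(k events out of n) when the event probability is Beta(a, b) distributed."""
    return math.comb(n, k) * math.exp(log_beta(k + a, n - k + b) - log_beta(a, b))


def icc(a, b):
    return 1.0 / (a + b + 1.0)


def variance_by_formula(n, a, b):
    """Binomial variance inflated by the multiplicative factor 1 + (n - 1) * ICC."""
    p = a / (a + b)
    return n * p * (1 - p) * (1 + (n - 1) * icc(a, b))


def variance_by_enumeration(n, a, b):
    """Variance computed directly from the pmf, as a check on the formula."""
    pmf = [beta_binomial_pmf(k, n, a, b) for k in range(n + 1)]
    mean = sum(k * q for k, q in enumerate(pmf))
    return sum((k - mean) ** 2 * q for k, q in enumerate(pmf))


print(variance_by_formula(20, 2.0, 6.0))
print(variance_by_enumeration(20, 2.0, 6.0))
```

The two printed values should agree to numerical precision, and setting the ICC to zero recovers the ordinary binomial variance.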
Topics: Bias; Binomial Distribution; Data Interpretation, Statistical; Humans; Meta-Analysis as Topic; Models, Statistical; Odds Ratio; Probability
PubMed: 28124446
DOI: 10.1002/sim.7233 -
Trends in Hearing 2022
A new sentence recognition test in Mandarin Chinese was developed and validated following the principles and procedures of development of the English AzBio sentence materials. The study was conducted in two stages. In the first stage, 1,020 sentences spoken by 4 talkers (2 males and 2 females) were processed through a 5-channel noise vocoder and presented to 17 normal-hearing Mandarin-speaking adults for recognition. A total of 600 sentences (150 from each talker) in the range of approximately 62 to 92% correct (mean = 78.0% correct) were subsequently selected to compile 30, 20-sentence lists. In the second stage, 30 adult CI users were recruited to verify the list equivalency. A repeated-measures analysis of variance followed by the post hoc Tukey's test revealed that 26 of the 30 lists were equivalent. Finally, a binomial distribution model was adopted to account for the inherent variability in the lists. It was found that the inter-list variability could be best accounted for with a 65-item binomial distribution model. The lower and upper limits of the 95% critical differences for one- and two-list recognition scores were then generated to provide guidance for detection of a significant difference in recognition scores in clinical settings. The final set of 26 equivalent lists contains sentence materials more difficult than those found in other speech audiometry materials in Mandarin Chinese. This test should help minimize the ceiling effects when testing sentence recognition in Mandarin-speaking CI users.
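To give a feel for how a 65-item binomial model translates into score limits, the sketch below computes a central 95% range of item counts for a given true proportion correct. This is a simplification of the critical-difference tables the abstract describes: it treats the first score as the true proportion and does not account for sampling variability in both scores. Only the item count of 65 comes from the abstract; the function names and the example proportion are assumptions.

```python
import math


def binomial_cdf(k, n, p):
    return sum(math.comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k + 1))


def score_limits(p, n=65, level=0.95):
    """Smallest item counts whose cumulative probability reaches each tail cutoff."""
    tail = (1 - level) / 2
    lower = next(k for k in range(n + 1) if binomial_cdf(k, n, p) >= tail)
    upper = next(k for k in range(n + 1) if binomial_cdf(k, n, p) >= 1 - tail)
    return lower, upper


lower, upper = score_limits(0.78)
print(f"{100 * lower / 65:.1f}% to {100 * upper / 65:.1f}% correct")
```

Because binomial variance shrinks toward 0% and 100% correct, the range is widest for scores near 50% and narrower near the extremes.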
Topics: Adult; Male; Female; Humans; Speech Perception; Audiometry, Speech; Language; Noise; China
PubMed: 36303434
DOI: 10.1177/23312165221134007 -
Clinical Infectious Diseases : An... Apr 2018
BACKGROUND
Substantial heterogeneity in measles outbreak sizes may be due to genotype-specific transmissibility. Using a branching process analysis, we characterize differences in measles transmission by estimating the association between genotype and the reproduction number R among postelimination California measles cases during 2000-2015 (400 cases, 165 outbreaks).
METHODS
Assuming a negative binomial secondary case distribution, we fit a branching process model to the distribution of outbreak sizes using maximum likelihood and estimated the reproduction number R for a multigenotype model.
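The outbreak-size likelihood behind this kind of analysis can be sketched for the single-type case. With a negative binomial offspring distribution (mean R, dispersion k) and R < 1, the probability that one introduction produces an outbreak of total size j has a known closed form, used below. The sketch omits the paper's multigenotype structure and any adjustment for imperfect observation; the outbreak sizes, the fixed value of k, and the grid search are hypothetical choices for illustration.

```python
import math


def outbreak_size_logpmf(j, R, k):
    """Log probability that a single introduction leads to j total cases."""
    return (math.lgamma(k * j + j - 1) - math.lgamma(k * j) - math.lgamma(j + 1)
            + (j - 1) * math.log(R / k)
            - (k * j + j - 1) * math.log(1 + R / k))


def log_likelihood(sizes, R, k):
    return sum(outbreak_size_logpmf(j, R, k) for j in sizes)


def fit_R(sizes, k, grid=None):
    """Maximum likelihood estimate of R over a grid, with k held fixed."""
    grid = grid or [i / 100 for i in range(1, 100)]
    return max(grid, key=lambda R: log_likelihood(sizes, R, k))


sizes = [1, 1, 1, 2, 1, 3, 1, 1, 5, 4]  # made-up outbreak sizes, mean 2
print(fit_R(sizes, k=0.5))  # 0.5
```

For this likelihood the estimate of R equals 1 − 1/(mean outbreak size) regardless of k, so the grid search returns 0.5 for data with a mean size of 2.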
RESULTS
Genotype B3 is found to be significantly more transmissible than other genotypes (P = .01) with an R of 0.64 (95% confidence interval [CI], .48-.71), while the R for all other genotypes combined is 0.43 (95% CI, .28-.54). This result is robust to excluding the 2014-2015 outbreak linked to Disneyland theme parks (referred to as "outbreak A" for conciseness and clarity) (P = .04) and modeling genotype as a random effect (P = .004 including outbreak A and P = .02 excluding outbreak A). This result was not accounted for by season of introduction, age of index case, or vaccination of the index case. The R for outbreaks with a school-aged index case is 0.69 (95% CI, .52-.78), while the R for outbreaks with a non-school-aged index case is 0.28 (95% CI, .19-.35), but this cannot account for differences between genotypes.
CONCLUSIONS
Variability in measles transmissibility may have important implications for measles control; the vaccination threshold required for elimination may not be the same for all genotypes or age groups.
Topics: Adolescent; Binomial Distribution; California; Child; Disease Eradication; Disease Outbreaks; Genotype; Humans; Likelihood Functions; Measles; Measles Vaccine; Measles virus; Models, Theoretical; Species Specificity; Vaccination
PubMed: 29228134
DOI: 10.1093/cid/cix974 -
Applied Psychological Measurement Jan 2023
Diagnostic classification models (DCMs) have been used to classify examinees into groups based on their possession status of a set of latent traits. In addition to traditional item-based scoring approaches, examinees may be scored based on their completion of a series of small and similar tasks. Those scores are usually considered as count variables. To model count scores, this study proposes a new class of DCMs that uses the negative binomial distribution at its core. We explained the proposed model framework and demonstrated its use through an operational example. Simulation studies were conducted to evaluate the performance of the proposed model and compare it with the Poisson-based DCM.
PubMed: 36425286
DOI: 10.1177/01466216221124604 -
Journal of Insect Science (Online) 2016
Field infestation and spatial distribution of introduced Bactrocera carambolae Drew and Hancock and native species of Anastrepha in common guavas [Psidium guajava (L.)] were investigated in the eastern Amazon. Fruit sampling was carried out in the municipalities of Calçoene and Oiapoque in the state of Amapá, Brazil. The frequency distribution of larvae in fruit was fitted to the negative binomial distribution. Anastrepha striata was more abundant in both sampled areas in comparison to Anastrepha fraterculus (Wiedemann) and B. carambolae. The frequency distribution analysis of adults revealed an aggregated pattern for B. carambolae as well as for A. fraterculus and Anastrepha striata Schiner, described by the negative binomial distribution. Although the populations of Anastrepha spp. may have suffered some impact due to the presence of B. carambolae, the results are still not robust enough to indicate effective reduction in the abundance of Anastrepha spp. caused by B. carambolae in a general sense. The high degree of aggregation observed for both species suggests interspecific co-occurrence with the simultaneous presence of both species in the analysed fruit. Moreover, a significant fraction of uninfested guavas also indicated absence of competitive displacement.
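Aggregation under a negative binomial model is often summarized by the parameter k, with small values indicating a strongly clumped distribution. The abstract does not say how the distribution was fitted, so the sketch below uses the common method-of-moments estimate k = mean² / (variance − mean) purely as an illustration; the larval counts are invented.

```python
import statistics


def nb_aggregation_k(counts):
    """Method-of-moments estimate of the negative binomial aggregation parameter k."""
    mean = statistics.mean(counts)
    variance = statistics.variance(counts)  # sample variance (n - 1 denominator)
    if variance <= mean:
        # No overdispersion: the counts look Poisson-like or more regular.
        return float("inf")
    return mean ** 2 / (variance - mean)


larvae_per_fruit = [0, 0, 0, 1, 0, 5, 0, 12, 0, 2]  # made-up counts
print(round(nb_aggregation_k(larvae_per_fruit), 3))
```

Many uninfested fruit alongside a few heavily infested ones push the variance well above the mean, which is what drives k toward zero.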
Topics: Animal Distribution; Animals; Brazil; Food Chain; Fruit; Insect Control; Larva; Psidium; Tephritidae
PubMed: 27638949
DOI: 10.1093/jisesa/iew076